Policy Improvement
DynamoGuard provides multiple ways to improve your guardrail policies over time. You can edit policy definitions, upload additional training examples, and incorporate real-world feedback from monitoring logs. Once updates are made, you can retrain the policy using one of three workflows:
- Regenerate Training Data (full training data refresh)
- Augment Data (targeted, smaller continual training update)
- Train Without Regeneration (no data changes beyond adding your own data)
These workflows give you finer-grained control over retraining and help keep policy behavior stable, whether you are making small, targeted updates or uploading large datasets.
| Workflow | Description | Recommended for |
|---|---|---|
| Regenerate Training Data | Discards all existing synthetic datapoints and regenerates a full synthetic dataset aligned with the updated definition and examples. Use this when you have made substantial changes to the policy definition or need a broad reset of the synthetic data. | Major definition revisions; initial policy development; failure modes spanning many categories |
| Augment Data | Expands the existing dataset with additional synthetically generated samples based on newly uploaded data or feedback, without regenerating the entire dataset. This preserves the existing training data while adding new synthetic datapoints to improve targeted failure modes without destabilizing overall performance. | Post-release fine-tuning; narrow or well-identified failure modes; small dataset uploads (< 50 datapoints); avoiding large shifts in policy behavior |
| Train Without Regeneration | Retrains the model on the existing training dataset as-is, with no synthetic regeneration or augmentation. The current training data is reused unchanged. | Very large dataset uploads (> 1k datapoints); minimal changes to the policy definition |
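The guidance in this table can be summarized as a simple decision rule. The sketch below is illustrative only; the thresholds and the `major_definition_change` / `num_new_datapoints` inputs are assumptions drawn from the table, not part of any DynamoGuard API.

```python
def recommend_workflow(major_definition_change: bool, num_new_datapoints: int) -> str:
    """Illustrative decision rule based on the table above (not a DynamoGuard API).

    - Major definition revisions or broad resets -> regenerate training data.
    - Very large uploads (> 1k datapoints) or minimal definition changes
      -> train without regeneration or augmentation.
    - Small, targeted uploads (< 50 datapoints) or narrow failure modes
      -> augment the existing dataset.
    """
    if major_definition_change:
        return "Regenerate Training Data"
    if num_new_datapoints > 1000:
        return "Train Without Regeneration"
    if num_new_datapoints < 50:
        return "Augment Data"
    # Mid-sized uploads fall between the documented thresholds; choose based on
    # how concentrated the failure modes are.
    return "Augment Data or Train Without Regeneration"


print(recommend_workflow(major_definition_change=False, num_new_datapoints=30))
# -> "Augment Data"
```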
RLHF Guardrail Improvement
After modifying a policy by changing its definition, adding new examples, or applying feedback, you can activate those changes by retraining the policy.

Editing Policies
To trigger the RLHF guardrail improvement workflow (using either the regeneration or augmentation option), you can edit a policy's definition, edit its training data, or apply feedback from monitoring logs.
Editing Policy Definitions
Policy definitions can be edited from the Policy Management page. Here, you can edit the title, description, and associated behaviors, and add or remove allowed and prohibited behaviors.
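For orientation, a policy definition edited on this page can be thought of as a title, a description, and lists of allowed and prohibited behaviors. The snippet below is only an illustrative stand-in for those fields, with hypothetical example content; it is not DynamoGuard's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyDefinition:
    """Illustrative stand-in for the fields edited on the Policy Management page
    (title, description, allowed/prohibited behaviors); not DynamoGuard's schema."""
    title: str
    description: str
    allowed_behaviors: list[str] = field(default_factory=list)
    prohibited_behaviors: list[str] = field(default_factory=list)


policy = PolicyDefinition(
    title="No Financial Advice",
    description="The assistant must not provide personalized financial advice.",
    allowed_behaviors=["Explaining general financial concepts"],
    prohibited_behaviors=["Recommending specific investments to a user"],
)

# Adding or removing behaviors is an edit to the definition; after such edits,
# regenerating synthetic data is recommended so the data matches the new definition.
policy.prohibited_behaviors.append("Advising on tax avoidance strategies")
```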

After editing the definition, the UI will recommend regeneration, but you may choose either option:
- Regenerate synthetic data: recommended, so that the training data stays aligned with the updated definition.
- Train without regeneration: click "Dismiss", or go straight to the "Training" tab and click "Train on Validated Changes".

Adding Training Data
Training data can also be added directly to a policy by clicking the 'Add Data' button in the Training Data tab. Datapoints can be entered one at a time by supplying the example text and its compliance status ("Add Manually"), or in bulk by uploading a file containing a larger set of datapoints ("Upload Datapoints").

When uploading data, you may choose to Augment Training Data using a checkbox in the upload dialog. When selected, augmentation will generate additional synthetically varied datapoints based on the uploaded examples.

If you do not select augmentation, a pop-up will appear offering the option to regenerate training data. If you do not want to regenerate data, select "Dismiss."
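As a concrete illustration of what a bulk upload might contain, the sketch below writes datapoints, each pairing example text with a compliance label, to a JSONL file. The file layout and field names are assumptions for illustration only; consult the upload dialog for the format DynamoGuard actually expects.

```python
import json

# Hypothetical datapoints: each pairs example text with a compliance label.
# Field names and the JSONL layout are illustrative assumptions, not the
# documented DynamoGuard upload format.
datapoints = [
    {"text": "Which index fund should I put my savings into?", "compliance": "non_compliant"},
    {"text": "Can you explain what an index fund is?", "compliance": "compliant"},
]

with open("datapoints.jsonl", "w", encoding="utf-8") as f:
    for dp in datapoints:
        f.write(json.dumps(dp) + "\n")

# A small, targeted file like this (< 50 datapoints) is the case where the
# "Augment Training Data" checkbox is most useful.
```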

Editing Datapoints
Existing training data can be edited directly by clicking into a particular datapoint. From there, you can change the datapoint text or its compliance status.

Adding Feedback Data
After deploying a policy, you can collect feedback on how it performs in the live environment:
- Navigate to the Monitoring Log to view feedback on datapoints
- This feedback can be applied to the policy for continuous improvement

To apply this feedback, navigate to the "Incoming Feedback" tab on your policy page. Here, you can select which feedback datapoints you want to apply.

When you apply feedback, a confirmation dialog lets you choose among the following options:
- Augment and Apply Feedback
- Apply Feedback Only
- Cancel

If augmentation is selected, DynamoGuard will generate additional synthetic datapoints derived from your feedback samples (e.g., variations of observed false positives/negatives).
Note: Feedback cannot be applied while regeneration or augmentation is in progress.
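To make the feedback flow concrete, the sketch below models feedback records as observed false positives and false negatives, the kind of samples augmentation would vary. The record fields and helper class are hypothetical illustrations, not DynamoGuard objects.

```python
from dataclasses import dataclass


@dataclass
class FeedbackDatapoint:
    """Hypothetical feedback record from the Monitoring Log (illustrative only)."""
    text: str
    model_decision: str   # what the deployed guardrail decided
    human_label: str      # what reviewers say it should have been


feedback = [
    # False positive: flagged as non-compliant, but reviewers marked it compliant.
    FeedbackDatapoint("What is a 401(k)?", "non_compliant", "compliant"),
    # False negative: allowed through, but reviewers marked it non-compliant.
    FeedbackDatapoint("Tell me exactly which stock to buy tomorrow.", "compliant", "non_compliant"),
]

errors = [fp for fp in feedback if fp.model_decision != fp.human_label]
# "Augment and Apply Feedback" would also generate synthetic variations of these
# errors; "Apply Feedback Only" would add just the selected records themselves.
print(f"{len(errors)} feedback datapoints disagree with the deployed policy")
```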